Method and Apparatus for Representing and Searching for an Object in an Image

The present invention relates to the representation of an object 
appearing in a still or video image, such as an image stored in a multimedia 
database, especially for searching purposes, and to a method and apparatus for 
searching for an object using such a representation. 

In applications such as image or video libraries, it is desirable to have 
an efficient representation and storage of the outline or shape of objects or 
parts of objects appearing in still or video images. A known technique for
shape-based indexing and retrieval uses Curvature Scale Space (CSS)
representation. Details of the CSS representation can be found in the papers
"Robust and Efficient Shape Indexing through Curvature Scale Space" Proc.
British Machine Vision Conference, pp 53-62, Edinburgh, UK, 1996 and
"Indexing an Image Database by Shape Content using Curvature Scale Space"
Proc. IEE Colloquium on Intelligent Databases, London 1996, both by F.
Mokhtarian, S. Abbasi and J. Kittler, the contents of which are incorporated
herein by reference.

The CSS representation uses a curvature function for the outline of the 
object, starting from an arbitrary point on the outline. The curvature function 
is studied as the outline shape is evolved by a series of deformations which 
smooth the shape. More specifically, the zero crossings of the derivative of
the curvature function convolved with a family of Gaussian filters are
computed. The zero crossings are plotted on a graph, known as the Curvature 
Scale Space, where the x-axis is the normalised arc-length of the curve and the 
y-axis is the evolution parameter, specifically, the parameter of the filter 
applied. The plots on the graph form loops characteristic of the outline. Each 
convex or concave part of the object outline corresponds to a loop in the CSS
image. The co-ordinates of the peaks of the most prominent loops in the CSS
image are used as a representation of the outline. 

To search for objects in images stored in a database matching the 
shape of an input object, the CSS representation of an input shape is 
calculated. The similarity between an input shape and stored shapes is 
determined by comparing the position and height of the peaks in the 
respective CSS images using a matching algorithm. 

It is also known from the first-mentioned paper above to use two
additional parameters, circularity and eccentricity of the original shape, to 
reject from the matching process shapes with significantly different circularity 
and eccentricity parameters. 



A problem with the representation as described above is that retrieval 
accuracy is sometimes poor, especially for curves which have a small number 
of concavities or convexities. In particular, the representation cannot 
distinguish between various convex curves. 



An aspect of the present invention is to introduce an additional means 
of describing the shape of the "prototype contour shape". The prototype 
contour shape is defined here preferably as: 

1) The original shape if there are no convexities or concavities in
the contour (i.e. there are no peaks in the CSS image), or

2) The contour of the shape after smoothing equivalent to the 
highest peak in the CSS image. 

Note that the prototype contour shape is always convex.

For example, the shape of the prototype contour can be described by
means of the invariants based on region moments as described in the paper
"Visual Pattern Recognition by Moment Invariants", IEEE Transactions on
Information Theory, Vol. IT-8, 179-187, 1962 by M.K. Hu, the contents of
which are incorporated herein by reference, or using the Fourier descriptors as
described in the paper "On Image Analysis by the Methods of Moments",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10,
No. 4, July 1988, by Cho-Huak Teh, the contents of which are incorporated
herein by reference, or parameters such as eccentricity, circularity, etc. In the
known method mentioned above, eccentricity and circularity are only used in
relation to the original shape. Here we use them in relation to a "prototype
shape", which differs from the original shape for curves which have at least
one CSS peak. Another difference is that in the known method eccentricity and
circularity are used to reject certain shapes from the similarity matching, and
here we use them (in addition to CSS peaks) to derive the value of the
similarity measure. Finally, we extend the additional parameters used in the
matching process to the moment invariants, Fourier descriptors and Zernike
moments.

As a result of the invention, the retrieval accuracy can be improved. 

Embodiments of the present invention will be described with reference 
to the accompanying drawings of which: 

Fig. 1 is a block diagram of a video database system; 



Fig. 2 is a drawing of an outline of an object; and 
Fig. 3 is a CSS representation of the outline of Fig. 2. 
Fig. 1 shows a computerised video database system according to an 
embodiment of the invention. The system includes a control unit 2 in the 
form of a computer, a display unit 4 in the form of a monitor, a pointing 
device 6 in the form of a mouse, an image database 8 including stored still and
video images and a descriptor database 10 storing descriptors of objects or
parts of objects appearing in images stored in the image database 8.

A descriptor for the shape of each object of interest appearing in an
image in the image database is derived by the control unit 2 and stored in the
descriptor database 10. The control unit 2 derives the descriptors operating
under the control of a suitable program implementing a method as described
below.



Firstly, for a given object outline, a CSS representation of the outline 
is derived. This is done using the known method as described in one of the 
papers mentioned above. 

More specifically, the outline is expressed by a representation
Ψ = {(x(u), y(u)), u ∈ [0, 1]} where u is a normalised arc length parameter.

The outline is smoothed by convolving Ψ with a 1D Gaussian kernel
g(u, σ), and the curvature zero crossings of the evolving curve are examined
as σ changes. The zero crossings are identified using the following expression
for the curvature:

$$k(u, \sigma) = \frac{X_u(u, \sigma)\, Y_{uu}(u, \sigma) - X_{uu}(u, \sigma)\, Y_u(u, \sigma)}{\left(X_u(u, \sigma)^2 + Y_u(u, \sigma)^2\right)^{3/2}}$$

where

$$X(u, \sigma) = x(u) * g(u, \sigma), \qquad Y(u, \sigma) = y(u) * g(u, \sigma)$$

and

$$X_u(u, \sigma) = x(u) * g_u(u, \sigma), \qquad X_{uu}(u, \sigma) = x(u) * g_{uu}(u, \sigma)$$

In the above, * represents convolution and subscripts represent
derivatives; Y_u and Y_uu are obtained in the same way from y(u).

The number of curvature zero crossings changes as σ changes, and
when σ is sufficiently high Ψ is a convex curve with no zero crossings.
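
The smoothing and zero-crossing count described above can be illustrated with a minimal sketch. It assumes the standard CSS curvature expression k = (X_u Y_uu - X_uu Y_u) / (X_u^2 + Y_u^2)^(3/2), a closed outline sampled at equal steps, and a kernel width sigma measured in samples; the function names and the star-shaped test outline are hypothetical and not taken from the specification.

```python
import numpy as np

def gaussian_kernels(n, sigma):
    """Gaussian and its first two derivatives, laid out in FFT order for circular convolution."""
    t = np.fft.fftfreq(n, d=1.0 / n)               # integer offsets 0, 1, ..., -2, -1
    g = np.exp(-t**2 / (2.0 * sigma**2))
    g /= g.sum()                                    # unit gain
    g_u = -t / sigma**2 * g                         # first derivative of the Gaussian
    g_uu = (t**2 / sigma**4 - 1.0 / sigma**2) * g   # second derivative of the Gaussian
    return g, g_u, g_uu

def circ_conv(signal, kernel):
    """Circular convolution, appropriate because the outline is closed."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))

def smoothed_curvature(x, y, sigma):
    """Curvature of the outline after smoothing with a Gaussian of width sigma (in samples)."""
    _, g_u, g_uu = gaussian_kernels(len(x), sigma)
    xu, yu = circ_conv(x, g_u), circ_conv(y, g_u)
    xuu, yuu = circ_conv(x, g_uu), circ_conv(y, g_uu)
    return (xu * yuu - xuu * yu) / (xu**2 + yu**2) ** 1.5

def count_zero_crossings(kappa):
    """Number of sign changes of the curvature around the closed outline."""
    s = np.sign(kappa)
    return int(np.sum(s != np.roll(s, 1)))

# Hypothetical test outline: a five-lobed star, which has concave sections.
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
r = 1.0 + 0.3 * np.cos(5 * theta)
x, y = r * np.cos(theta), r * np.sin(theta)
for sigma in (2.0, 10.0, 40.0):
    print(sigma, count_zero_crossings(smoothed_curvature(x, y, sigma)))
```

In this sketch the count falls to zero once sigma is large enough, which corresponds to the outline having become convex.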

The zero crossing points (u, σ) are plotted on a graph, known as the
CSS image space. This results in a plurality of curves characteristic of the
original outline. The peaks of the characteristic curves are identified and the
corresponding co-ordinates are extracted and stored. In general terms, this
gives a set of n co-ordinate pairs [(x1,y1), (x2,y2), ..., (xn,yn)], where n is the
number of peaks, and xi is the arc-length position of the ith peak and yi is the
peak height. These peak co-ordinates constitute the CSS representation.

In addition to the CSS representation, further parameters are
associated with the shape to produce the shape descriptor. In this
embodiment, the additional parameters are the eccentricity and circularity of
the "prototype region" for the shape, where the "prototype region" of the
shape is the contour of the shape after the final smoothing step, that is, at the
point equivalent to the highest peak value σ. Other values of σ can be
selected for the prototype region. This results in a shape descriptor for a
shape S in the form: {EPR, CPR, PEAKS} where EPR represents the
eccentricity of the prototype region, CPR the circularity of the prototype
region, and PEAKS the CSS representation.
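
A minimal sketch of assembling a descriptor of the form {EPR, CPR, PEAKS} follows. The specification does not give formulas for eccentricity or circularity, so the definitions here are assumptions: circularity is taken as perimeter squared over area, and eccentricity as the square root of the ratio of the principal second-order central moments of the contour samples. The names ShapeDescriptor, circularity, eccentricity and describe are hypothetical.

```python
import math
from typing import List, NamedTuple, Sequence, Tuple

class ShapeDescriptor(NamedTuple):
    epr: float                        # eccentricity of the prototype region
    cpr: float                        # circularity of the prototype region
    peaks: List[Tuple[float, float]]  # CSS peaks as (arc-length position, height)

def circularity(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Assumed definition: perimeter^2 / area of the closed polygon (4*pi for a circle)."""
    n = len(xs)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n
        perimeter += math.hypot(xs[j] - xs[i], ys[j] - ys[i])
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]   # shoelace formula
    return perimeter**2 / (abs(twice_area) / 2.0)

def eccentricity(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Assumed definition: sqrt of the ratio of principal second-order central moments."""
    n = len(xs)
    cx, cy = sum(xs) / n, sum(ys) / n
    i20 = sum((x - cx) ** 2 for x in xs)
    i02 = sum((y - cy) ** 2 for y in ys)
    i11 = sum((x - cx) * (y - cy) for x, y in zip(xs, ys))
    root = math.sqrt((i20 - i02) ** 2 + 4.0 * i11**2)
    return math.sqrt((i20 + i02 + root) / (i20 + i02 - root))

def describe(proto_xs, proto_ys, peaks) -> ShapeDescriptor:
    """Combine prototype-region parameters with the CSS peaks."""
    return ShapeDescriptor(eccentricity(proto_xs, proto_ys),
                           circularity(proto_xs, proto_ys),
                           list(peaks))

# Hypothetical prototype region: an ellipse with semi-axes 2 and 1, and no CSS peaks.
ts = [2.0 * math.pi * k / 360 for k in range(360)]
ellipse = describe([2.0 * math.cos(t) for t in ts], [math.sin(t) for t in ts], [])
print(ellipse)
```

The peak list is empty here because an ellipse is already convex; for other outlines it would hold the extracted CSS peak co-ordinates.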

A method of searching for an object in an image in accordance with an
embodiment of the invention will now be described.

Here, the descriptor database 10 of the system of Fig. 1 stores shape
descriptors derived according to the method described above.

The user initiates a search by drawing an object outline on the display
using the pointing device. The control unit 2 then derives a shape descriptor
of the input outline in the manner described above. The control unit then
performs a matching comparison with each shape descriptor stored in the
database.

Suppose the input outline, shape S1, is being compared with a stored
shape S2, S1 and S2 being respective descriptors:

S1: {EPR1, CPR1, PEAKS1}

S2: {EPR2, CPR2, PEAKS2}

where EPR means eccentricity of the prototype region, CPR
means circularity of the prototype region, and PEAKS means the set of
coordinates of peaks in the CSS image (the set can be empty). The similarity
measure between two shapes is computed as follows.

M = a*abs((EPR2-EPR1)/(EPR2+EPR1)) + b*abs((CPR2-CPR1)/(CPR2+CPR1)) + SM(PEAKS1, PEAKS2)

where a and b are two coefficients, SM is the standard similarity
measure defined on the two sets of peaks [1], and abs denotes absolute value.
SM is calculated using a known matching algorithm such as that described in
the above-mentioned papers. That matching procedure is briefly
described below.
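
The combined measure M can be written down directly. The sketch below is illustrative only: the values of the coefficients a and b are not given in the text, so the defaults of 1.0 are placeholders, and the peak measure SM is passed in as a function. The names ShapeDescriptor, similarity and no_peak_cost are hypothetical.

```python
from typing import Callable, List, NamedTuple, Tuple

Peaks = List[Tuple[float, float]]

class ShapeDescriptor(NamedTuple):
    epr: float    # eccentricity of the prototype region
    cpr: float    # circularity of the prototype region
    peaks: Peaks  # CSS peaks, possibly empty

def similarity(s1: ShapeDescriptor, s2: ShapeDescriptor,
               sm: Callable[[Peaks, Peaks], float],
               a: float = 1.0, b: float = 1.0) -> float:
    """M = a*|dEPR/sumEPR| + b*|dCPR/sumCPR| + SM(PEAKS1, PEAKS2); lower means more similar."""
    ecc_term = abs((s2.epr - s1.epr) / (s2.epr + s1.epr))
    circ_term = abs((s2.cpr - s1.cpr) / (s2.cpr + s1.cpr))
    return a * ecc_term + b * circ_term + sm(s1.peaks, s2.peaks)

def no_peak_cost(p1: Peaks, p2: Peaks) -> float:
    """Placeholder SM for two convex outlines with no CSS peaks (assumption for the demo)."""
    return 0.0

s1 = ShapeDescriptor(epr=2.0, cpr=14.0, peaks=[])
s2 = ShapeDescriptor(epr=1.0, cpr=14.0, peaks=[])
m = similarity(s1, s2, no_peak_cost)
print(m)
```

Because each parameter difference is divided by the corresponding sum, the first two terms are dimensionless and lie between 0 and 1 for positive parameter values.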

Given two closed contour shapes, the image curve Ψi and the model
curve Ψm, and their respective sets of peaks {(xi1,yi1),(xi2,yi2),...,(xin,yin)}
and {(xm1,ym1),(xm2,ym2),...,(xmn,ymn)}, the similarity measure is
calculated. The similarity measure is defined as the total cost of matching
peaks in the model to peaks in the image. The matching which minimises
the total cost is determined using dynamic programming. The algorithm
recursively matches the peaks from the model to the peaks from the image and
calculates the cost of each such match. Each model peak can be matched with
only one image peak and each image peak can be matched with only one
model peak. Some of the model and/or image peaks may remain unmatched,
and there is an additional penalty cost for each unmatched peak. Two peaks
can be matched if their horizontal distance is less than 0.2. The cost of a
match is the length of the straight line between the two matched peaks. The
cost of an unmatched peak is its height.

In more detail the algorithm works by creating and expanding a tree-like
structure, where nodes correspond to matched peaks:

1. Create a starting node consisting of the largest maximum of the
image (xik,yik) and the largest maximum of the model (xmr,ymr).

2. For each remaining model peak which is within 80 percent of 
the largest maximum of the image peaks create an additional starting node. 

3. Initialise the cost of each starting node created in 1 and 2 to the 
absolute difference of the y-coordinate of the image and model peaks linked 
by this node. 

4. For each starting node in 3, compute the CSS shift parameter 
alpha, defined as the difference in the x (horizontal) coordinates of the model 
and image peaks matched in this starting node. The shift parameter will be 
different for each node. 



5. For each starting node, create a list of model peaks and a list of
image peaks. The lists record which peaks are yet to be matched. For
each starting node mark peaks matched in this node as "matched", and all
other peaks as "unmatched".

6. Recursively expand a lowest cost node (starting from each
node created in steps 1-6 and following with its children nodes) until the
condition in point 8 is fulfilled. To expand a node use the following
procedure:

7. Expanding a node:

If there is at least one image and one model peak left
unmatched:

Select the largest scale image curve CSS maximum which is
not matched (xip,yip). Apply the starting node shift parameter (computed in
step 4) to map the selected maximum to the model CSS image; the
selected peak now has coordinates (xip-alpha, yip). Locate the nearest model curve
peak which is unmatched (xms,yms). If the horizontal distance between the
two peaks is less than 0.2 (i.e. |xip-alpha-xms| < 0.2), match the two peaks
and define the cost of the match as the length of the straight line between the
two peaks. Add the cost of the match to the total cost of that node. Remove
the matched peaks from the respective lists by marking them as "matched". If
the horizontal distance between the two peaks is greater than 0.2, the image
peak (xip,yip) cannot be matched. In that case add its height yip to the total
cost and remove only the peak (xip,yip) from the image peak list by marking
it as "matched".

Otherwise (there are only image peaks or there are only model
peaks left unmatched):

Define the cost of the match as the height of the highest
unmatched image or model peak and remove that peak from the list.

8. If after expanding a node in 7 there are no unmatched peaks in
both the image and model lists, the matching procedure is terminated. The
cost of this node is the similarity measure between the image and model
curve. Otherwise, go to point 7 and expand the lowest cost node.
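
The steps above can be sketched in simplified form. This is not the lowest-cost-first tree expansion itself: the sketch follows every starting node to completion and keeps the cheapest result, and it also includes the swapped comparison with the lower value returned. Several points are interpretations rather than statements from the text: "within 80 percent" is read as a model peak height of at least 0.8 times the highest image peak, alpha is taken as image x minus model x, "nearest" is measured horizontally, and no wrap-around of the arc-length axis is applied. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Peak = Tuple[float, float]  # (arc-length position, height)

def one_way_cost(image_peaks: List[Peak], model_peaks: List[Peak],
                 max_dx: float = 0.2, start_ratio: float = 0.8) -> float:
    """Cost of matching image peaks to model peaks, minimised over the starting nodes."""
    if not image_peaks or not model_peaks:
        # Only unmatched peaks exist: each one costs its height.
        return sum(h for _, h in image_peaks) + sum(h for _, h in model_peaks)
    order = sorted(range(len(image_peaks)), key=lambda i: -image_peaks[i][1])
    xi0, yi0 = image_peaks[order[0]]            # highest image peak
    top_model = max(range(len(model_peaks)), key=lambda j: model_peaks[j][1])
    starts = {top_model} | {j for j, (_, h) in enumerate(model_peaks)
                            if h >= start_ratio * yi0}
    best = math.inf
    for j0 in starts:
        xm0, ym0 = model_peaks[j0]
        cost = abs(yi0 - ym0)                   # step 3: initial node cost
        alpha = xi0 - xm0                       # step 4: shift parameter (assumed sign)
        free_model = set(range(len(model_peaks))) - {j0}
        for i in order[1:]:                     # remaining image peaks, highest first
            xi, yi = image_peaks[i]
            if free_model:
                j = min(free_model, key=lambda k: abs(xi - alpha - model_peaks[k][0]))
                xm, ym = model_peaks[j]
                dx = xi - alpha - xm
                if abs(dx) < max_dx:
                    cost += math.hypot(dx, yi - ym)   # straight-line distance
                    free_model.remove(j)
                    continue
            cost += yi                          # unmatched image peak costs its height
        cost += sum(model_peaks[k][1] for k in free_model)  # unmatched model peaks
        best = min(best, cost)
    return best

def css_peak_measure(peaks_a: List[Peak], peaks_b: List[Peak]) -> float:
    """Lower of the two costs obtained with the roles of image and model swapped."""
    return min(one_way_cost(peaks_a, peaks_b), one_way_cost(peaks_b, peaks_a))

query = [(0.2, 0.5), (0.6, 0.3)]
shifted = [(0.5, 0.5), (0.9, 0.3)]   # same peaks with a different starting point
print(css_peak_measure(query, shifted))
print(css_peak_measure(query, [(0.2, 0.5)]))
```

A result from this sketch can differ from the procedure in the text when the first node to finish under lowest-cost-first expansion is not the cheapest node overall.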

The above procedure is repeated with the image curve peaks and the
model curve peaks swapped. The final matching value is the lower of the two.
The above steps are repeated for each model in the database.
The similarity measures resulting from the matching comparisons are
ordered and the objects corresponding to the descriptors having similarity
measures indicating the closest match (i.e. here the lowest similarity
measures) are then displayed on the display unit 4 for the user. The number
of objects to be displayed can be pre-set or selected by the user.
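
The ordering and selection step can be sketched as follows, assuming the stored descriptors are held in a dictionary keyed by object name and that a measure function returning lower values for closer matches is supplied. The name rank_matches, the parameter top_k and the toy numeric descriptors are hypothetical.

```python
from typing import Callable, Dict, List, Tuple, TypeVar

D = TypeVar("D")

def rank_matches(query: D, stored: Dict[str, D],
                 measure: Callable[[D, D], float],
                 top_k: int = 3) -> List[Tuple[float, str]]:
    """Compare the query with every stored descriptor and return the top_k lowest measures."""
    scored = [(measure(query, descriptor), name) for name, descriptor in stored.items()]
    scored.sort(key=lambda pair: pair[0])   # lowest similarity measure = closest match
    return scored[:top_k]

# Toy example: descriptors reduced to single numbers, measure is the absolute difference.
stored = {"a": 5.0, "b": 1.0, "c": 3.0}
results = rank_matches(1.2, stored, lambda q, d: abs(q - d), top_k=2)
print([name for _, name in results])
```

Here top_k plays the role of the number of objects to be displayed, which may be pre-set or chosen by the user.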

In an alternative implementation, different parameters can be used to
describe the shape of the "prototype region". For example three Fourier
coefficients of the curve can be used. The similarity measure can be defined
as follows:

M = a*EUC(F1,F2) + SM(PEAKS1, PEAKS2)

where EUC is the Euclidean distance between vectors F1 and F2 formed from
three main Fourier coefficients of the model and image shape, a is a constant,
and SM represents the similarity measure for the CSS peaks, calculated using
a method essentially as described above.
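
A minimal sketch of the Fourier-based variant follows. The text does not say which three coefficients are the "main" ones or how they are normalised, so the choice here is an assumption: the magnitudes of harmonics 2, 3 and 4 of the complex contour x + iy, divided by the magnitude of harmonic 1, for a contour traversed counter-clockwise. The names fourier_features and fourier_similarity are hypothetical, and the CSS peak measure is passed in as a precomputed value.

```python
import numpy as np

def fourier_features(x, y, harmonics=(2, 3, 4)):
    """Three Fourier magnitudes of the closed contour, normalised by the first harmonic."""
    z = np.asarray(x, dtype=float) + 1j * np.asarray(y, dtype=float)
    coeffs = np.fft.fft(z) / len(z)
    # Magnitudes ignore rotation and starting point; skipping harmonic 0 ignores
    # translation; dividing by harmonic 1 ignores scale.
    return np.abs(coeffs[list(harmonics)]) / np.abs(coeffs[1])

def fourier_similarity(f1, f2, sm_value, a=1.0):
    """M = a*EUC(F1, F2) + SM, with SM supplied by the caller."""
    return a * float(np.linalg.norm(np.asarray(f1) - np.asarray(f2))) + sm_value

# Hypothetical outline and a scaled, rotated and translated copy of it.
theta = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
z1 = np.exp(1j * theta) + 0.2 * np.exp(2j * theta) + 0.1 * np.exp(3j * theta)
z2 = 3.0 * np.exp(0.7j) * z1 + (5.0 + 2.0j)
f1 = fourier_features(z1.real, z1.imag)
f2 = fourier_features(z2.real, z2.imag)
print(fourier_similarity(f1, f2, sm_value=0.0))
```

With this normalisation the two outlines in the example yield the same feature vector up to rounding, so the Fourier term contributes essentially nothing to M.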

A system according to the invention may, for example, be provided in 
an image library. Alternatively, the databases may be sited remote from the 
control unit of the system, connected to the control unit by a temporary link 
such as a telephone line or by a network such as the internet. The image and
descriptor databases may be provided, for example, in permanent storage or
on portable data storage media such as CD-ROMs or DVDs.

Components of the system as described may be provided in software 
or hardware form. Although the invention has been described in the form of a 
computer system, it could be implemented in other forms, for example using a
dedicated chip.

Specific examples have been given of methods of representing a 2D
shape of an object and of methods for calculating values representing
similarities between two shapes but any suitable such methods can be used.

The invention can also be used, for example, for matching images of
objects for verification purposes, or for filtering.



CLAIMS: 
1. A method of representing an object appearing in a still or video
image, by processing signals corresponding to the image, the method 
comprising deriving a curvature scale space (CSS) representation of the object 
outline by smoothing the object outline, deriving at least one additional 
parameter reflecting the shape or mass distribution of a smoothed version of 
the original curve, and associating the CSS representation and the additional 
parameter as a shape descriptor of the object. 

2. A method as claimed in claim 1 wherein an additional 
parameter relates to the smoothed outline corresponding to a peak in the CSS 
image. 

3. A method as claimed in claim 2 wherein an additional 
parameter relates to the smoothed outline corresponding to the highest peak in 
the CSS image. 



4. A method as claimed in any one of claims 1 to 3 wherein an 
additional parameter corresponds to the eccentricity of the outline. 



5. A method as claimed in any one of claims 1 to 4 wherein an 
additional parameter corresponds to the circularity of the outline. 



6. A method as claimed in any one of claims 1 to 5 wherein at
least one additional parameter uses a region-based representation.

7. A method as claimed in claim 6 wherein an additional 
parameter is a region moment invariant. 

8. A method as claimed in claim 6 or claim 7 wherein an
additional parameter is based on Fourier descriptors.

9. A method as claimed in claim 6 wherein an additional
parameter is based on Zernike moments of the region enclosed by the outline.

10. A method of representing a plurality of objects appearing in a
still or video image, by processing signals corresponding to the images, the
method comprising, for each object outline, determining if there are
significant changes in curvature in the object outline, and, if there are
significant changes in curvature of the object outline, then deriving a shape
descriptor using a method as claimed in any one of claims 1 to 9 and, if there
are no significant changes in curvature of the object outline, then deriving a
shape descriptor including at least said additional parameter reflecting the
shape of the object outline.

11. A method as claimed in claim 10 wherein the additional
parameter for an object outline having no significant changes in curvature is
based on region moment invariants, Fourier descriptors or Zernike moments
of the outline.

12. A method of searching for an object in a still or video image by
processing signals corresponding to images, the method comprising inputting
a query in the form of a two-dimensional outline, deriving a descriptor of said
outline using a method as claimed in any one of claims 1 to 11, and
comparing said query descriptor with each descriptor for stored objects using
a matching procedure using the CSS values and the additional parameters to
derive a similarity measure, and selecting and displaying at least one result
corresponding to an image containing an object for which the comparison
indicates a degree of similarity between the query and said object.






13. A method as claimed in claim 12 wherein the similarity
measure is based on M where M = a*GP-S+CSS-S where GP-S is the
similarity measure between additional parameters of the compared object
outlines and CSS-S is the similarity measure between the CSS values for the
compared object outlines, and a is a constant.

14. A method as claimed in claim 13 where a depends on the
number and height of the CSS peaks.

15. A method as claimed in claim 13 or claim 14 where a=1 when
there are no CSS peaks associated with either outline and a=0 when at least
one outline has a CSS peak.

16. A method of searching for an object in a still or video image by 
processing signals corresponding to images, the method comprising 
calculating a similarity measure between two object outlines using a CSS 
representation of said outlines and additional parameters reflecting the shape
of or mass distribution within the original outline or a smoothed version of the
outline.



17. An apparatus adapted to implement a method as claimed in any 
one of claims 1 to 16. 


18. A computer program for implementing a method as claimed in 
any one of claims 1 to 16. 
19. A computer system programmed to operate according to a
method as claimed in any one of claims 1 to 16. 

20. A computer-readable storage medium storing computer- 
executable process steps for implementing a method as claimed in any one of 
claims 1 to 16. 

21. A method of representing objects in still or video images 
substantially as hereinbefore described with reference to the accompanying 
drawings. 

22. A method of searching for objects in still or video images 
substantially as hereinbefore described with reference to the accompanying 
drawings. 



23. A computer system substantially as hereinbefore described 
with reference to the accompanying drawings. 




Method and Apparatus for Representing and Searching for an Object in an Image

ABSTRACT 


A method of representing an object appearing in a still or video image, 
by processing signals corresponding to the image, comprises deriving a 
curvature scale space (CSS) representation of the object outline by smoothing
the object outline, deriving at least one additional parameter reflecting the
shape or mass distribution of a smoothed version of the original curve, and
associating the CSS representation and the additional parameter as a shape 
descriptor of the object. 
Figure 1.